Certainty upon Empirical Distributions

Author

  • Joan Garriga
Abstract

We address the problem of assessing the information conveyed by a finite discrete probability distribution in the context of knowledge discovery. Our approach rests on two axiomatic intuitions: (i) a uniform distribution conveys the minimum information, and (ii) knowledge is akin to a notion of richness, related to the dimension of the distribution. From this perspective, we define a statistic with a clear interpretation as a measure of certainty, and we build a plausible hypothesis that offers a comprehensible insight into knowledge and has a consistent algebraic structure. This includes a native value for the uncertainty related to unseen events. We then compare our contributions with entropy-based measures. Finally, by implementing our measure in a decision tree induction algorithm, we empirically validate its behavior relative to entropy. We conclude that the contributions of our measure are significant and should lead to more robust models.
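As a point of reference for the entropy-based measures mentioned in the abstract, the following minimal sketch computes Shannon entropy for a finite discrete distribution and shows that a uniform distribution attains the maximum (that is, conveys the least information). It does not implement the certainty statistic proposed in the paper, whose definition is not given here; the function names and the example distributions are illustrative assumptions.

```python
import math


def shannon_entropy(probs):
    """Shannon entropy, in bits, of a finite discrete distribution.

    `probs` is assumed to be a sequence of non-negative numbers summing to 1.
    Zero-probability outcomes contribute nothing to the sum.
    """
    return -sum(p * math.log2(p) for p in probs if p > 0)


def max_entropy(n):
    """Entropy of the uniform distribution over n outcomes: log2(n) bits."""
    return math.log2(n)


# Hypothetical example distributions over four outcomes.
uniform = [0.25, 0.25, 0.25, 0.25]
peaked = [0.85, 0.05, 0.05, 0.05]

h_uniform = shannon_entropy(uniform)  # equals max_entropy(4)
h_peaked = shannon_entropy(peaked)    # strictly smaller than h_uniform
```

Note that the entropy of a uniform distribution grows with the number of outcomes (`max_entropy(n)`), so raw entropy values from distributions of different dimension are not directly comparable without some normalization.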


Similar resources

"Certainty" and expert mental health opinions in legal proceedings.

This pilot study addresses the legal and scientific ramifications of the "certainty" expressed by mental health professionals when functioning as expert witnesses in criminal and civil proceedings. The sporadic attention paid to "certainty" in the professional literature has typically taken the form of general policy oriented analyses as opposed to empirical, data-driven investigations. In the ...


The Stock Returns Volatility based on the GARCH (1,1) Model: The Superiority of the Truncated Standard Normal Distribution in Forecasting Volatility

In this paper, we specify that the GARCH(1,1) model has strong forecasting volatility and its usage under the truncated standard normal distribution (TSND) is more suitable than when it is under the normal and Student-t distributions. On the contrary, no comparison was tried between the forecasting performance of volatility of the daily return series using the multi-step ahead forec...


EMPIRICAL BAYES ANALYSIS OF TWO-FACTOR EXPERIMENTS UNDER INVERSE GAUSSIAN MODEL

A two-factor experiment with interaction between factors wherein observations follow an Inverse Gaussian model is considered. Analysis of the experiment is approached via an empirical Bayes procedure. The conjugate family of prior distributions is considered. Bayes and empirical Bayes estimators are derived. Application of the procedure is illustrated on a data set, which has previously been an...


Empirical Bayes Estimators with Uncertainty Measures for NEF-QVF Populations

The paper proposes empirical Bayes (EB) estimators for simultaneous estimation of means in the natural exponential family (NEF) with quadratic variance functions (QVF) models. Morris (1982, 1983a) characterized the NEF-QVF distributions which include among others the binomial, Poisson and normal distributions. In addition to the EB estimators, we provide approximations to the MSE’s of t...


Skew-slash distribution and its application in topics regression

In many statistical modeling problems, the common assumption is that observations are normally distributed. In many real data applications, however, the true distribution deviates from the normal. Thus, the main concern of most recent studies on analyzing data is to construct and use alternative distributions. In this regard, new classes of distributions such as slash and skew-sla...




Publication date: 2011